Practical Algorithms for Pattern Based Linear Regression
نویسندگان
چکیده
We consider the problem of discovering the optimal pattern from a set of strings and associated numeric attribute values. The goodness of a pattern is measured by the correlation between the number of occurrences of the pattern in each string, and the numeric attribute value assigned to the string. We present two algorithms based on suffix trees, that can find the optimal substring pattern in O(Nn) and O(N) time, respectively, where n is the number of strings and N is their total length. We further present a general branch and bound strategy that can be used when considering more complex pattern classes. We also show that combining the O(N) algorithm and the branch and bound heuristic increases the efficiency of the algorithm considerably.
منابع مشابه
Modelling Climatic Parameters Affecting the Annual Yield of Rheum Ribes Rangeland Species using Data Mining Algorithms
Identification of climatic characteristics affecting the annual yield of Rheum Ribes can be useful in management and development of this species in the rangelands. In this research, the annual yield of this species in Khorasan-Razavi province based on 74 climatic parameters during a ten-year period evaluated and affecting climatic parameters extracted using data mining methods. First, the role ...
متن کاملA Combinatorial Algorithm for Fuzzy Parameter Estimation with Application to Uncertain Measurements
This paper presents a new method for regression model prediction in an uncertain environment. In practical engineering problems, in order to develop regression or ANN model for making predictions, the average of set of repeated observed values are introduced to the model as an input variable. Therefore, the estimated response of the process is also the average of a set of output values where th...
متن کاملModeling and forecasting US presidential election using learning algorithms
The primary objective of this research is to obtain an accurate forecasting model for the US presidential election. To identify a reliable model, artificial neural networks (ANN) and support vector regression (SVR) models are compared based on some specified performance measures. Moreover, six independent variables such as GDP, unemployment rate, the president’s approval rate, and others are co...
متن کاملA Thinning Method of Linear And Planar Array Antennas To Reduce SLL of Radiation Pattern By GWO And ICA Algorithms
In the recent years, the optimization techniques using evolutionary algorithms have been widely used to solve electromagnetic problems. These algorithms use thinning the antenna arrays with the aim of reducing the complexity and thus achieving the optimal solution and decreasing the side lobe level. To obtain the optimal solution, thinning is performed by removing some elements in an array thro...
متن کاملMachine learning algorithms in air quality modeling
Modern studies in the field of environment science and engineering show that deterministic models struggle to capture the relationship between the concentration of atmospheric pollutants and their emission sources. The recent advances in statistical modeling based on machine learning approaches have emerged as solution to tackle these issues. It is a fact that, input variable type largely affec...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2005